Abstract

We present the first semi-streaming algorithms to determine k-connectivity
of an undirected graph with k being any constant. The semi-streaming
model for graph algorithms was introduced by Muthukrishnan in 2003 
and turns out to be useful when dealing with massive graphs streamed in 
from an external storage device. 

Our two semi-streaming algorithms each compute a sparse subgraph 
of an input graph G and can use this subgraph in a postprocessing step to 
decide k-connectivity of G. To this end the first algorithm reads the input
stream only once and uses time O(k^2 n) to process each input edge. The
second algorithm reads the input k + 1 times and needs time O(k + α(n))
per input edge. Using its constructed subgraph the second algorithm can
also generate all ℓ-separators of the input graph for all ℓ < k.



Keywords: graph, semi-streaming algorithm, connectivity, vertex connectivity, 
separator 

1 Introduction 

Semi-Streaming Model. In recent years the computational model of
streaming algorithms has gained popularity, not only because of its interesting
theoretical implications but also due to its usefulness in practice. Real-world
applications face ever-growing amounts of data and need the ability to deal
with massive amounts of information. Examples range from oceanographic and
atmospheric data to the huge databases of data warehousing. Input data
of this kind can easily reach terabytes or petabytes in size. Thus
traditional algorithms that rely on random access to the input are
not useful here. Moreover, it cannot be taken for granted that the whole
input is available in the memory of the algorithm; rather, it is stored on disk or
tape. For developing time-efficient algorithms working on these storage devices
it is reasonable to assume the input of the algorithm (which is the output of
the storage devices) to be a sequential stream. While tapes produce a stream
as their natural output, disks reach much higher output rates when presenting
their data sequentially in the order it is stored.

This is where streaming algorithms come into play. They provide a
computational model useful for dealing with large amounts of data stored in 
external devices. In the classical data stream model [13], [15] the input data 
can only be accessed sequentially as a data stream. The streaming algorithm 
has to process this input using a working memory that is small compared to 
the length of the input stream. In particular the algorithm is unable to store 
the whole input and therefore has to make space-efficient summarizations of the 
input according to the query to be answered. 

Much of the previous work in the area of streaming models is focused on 
generating statistical values for a stream of input elements. There are streaming 
algorithms approximating frequency moments [1], computing histograms [12] 
and wavelet decompositions [11]. For a comprehensive overview the reader is 
referred to [15] and the references therein. 

Real-world applications often deal with data modeled as a graph G(V, E)
composed of vertices V and edges E. One example is the call graph of telecom- 
munication providers modeling the users as vertices and the telephone calls as 
edges between them. A second example is the structure of the WWW where 
pages are vertices and links correspond to edges. Both are massive graphs and 
answering queries on these graphs means to solve graph theoretical problems on 
a huge amount of input. 

The traditional streaming model [15] restricts an algorithm on a graph with 
n vertices to a memory size of o(n) bits. That does not suffice even to solve basic 
graph problems [8]. Therefore Muthukrishnan [15] proposed the semi-streaming
model for handling graph issues in the context of streaming: Given a graph
G(V, E), n = |V| and m = |E|, a semi-streaming algorithm is presented an
arbitrary order of the edges of G as a stream. The algorithm can only access this
input sequentially in the order it is given; it might process the input stream
several times. The algorithm has a working memory consisting of O(n · polylog n)
bits, thus there is space to store the vertices but not enough to store the edges
of G if G is a dense graph, i.e., n = o(m).

Several graph problems have been treated successfully in the
semi-streaming model. In [7] a semi-streaming algorithm is given for testing if
a graph is connected and for creating a bipartition of the vertices or stating that
none exists. The same paper presents a semi-streaming algorithm that
creates a minimum spanning tree of a weighted graph, as well as one that
calculates all cut-vertices of a graph. There are approaches in [7], [14] to approximate
a maximum matching in unweighted and weighted graphs. In [8] the authors
use the idea of a spanner to develop approximations for all-pairs shortest paths,
diameter and girth of a graph.

On the other hand there are some results showing the limits of the
semi-streaming model. We just name two examples. Testing connectivity of a
directed graph is not possible in the semi-streaming model [7] and for general
graphs a breadth-first search tree cannot be created in a constant number of
passes over the input [8].

k-Connectivity. The notion of k-connectivity of a graph arises for example
when looking at telecommunication networks. These networks have to be robust:
even in the case of failures a user of the network should be guaranteed to
reach every other user. On the network modeled as a graph, routers and users
being the vertices, cables between them being the edges, we could ask how many
vertices may fail such that the network still provides a connection between every
pair of users. A graph G(V, E) is said to be k-vertex connected or k-connected
if after the removal of any k − 1 vertices G is still connected, that is, it contains
a path between each pair of vertices. As a classical topic of graph theory this
problem has been extensively studied in both the directed and undirected case,
see [17] for an overview. We only consider the undirected case throughout this
paper. The largest k such that a given undirected graph is k-connected
can be found using a variety of algorithms, for example one due to Gabow [9] which
runs in time O(n + min{k^{5/2}, k n^{3/4}} k n).

k-Connectivity in the Semi-Streaming Model. The situation in the
semi-streaming model is quite different. So far only semi-streaming algorithms
for deciding k-connectivity for k ≤ 4 are known. For 1-connectivity, which
is just connectivity, in [7] a semi-streaming algorithm is given that needs only
one pass over the input stream and processes each input edge in time O(α(n)),
where α(n) is the extremely slowly growing inverse of the Ackermann function
[18]. For k = 2, 3, 4 the authors of [8] present an adaptation of a sparsification
technique of [5]. That leads to semi-streaming algorithms for testing 2- and
3-connectivity in time O(α(n)) per edge and for testing 4-connectivity in time
O(log n) per edge. These approaches can also be used to identify ℓ-separators
of G for ℓ < k. An ℓ-separator of a graph G is a set of ℓ vertices whose removal
leaves a graph with more connected components than G. However, there is no
semi-streaming algorithm determining if a given graph is k-connected for any
constant k > 4, not to mention one to find ℓ-separators for constant ℓ > 3.

Our Contributions. In this paper we present the first two semi-streaming
algorithms for determining if a given graph is k-connected for k being an
arbitrary constant. The first algorithm is an adaptation of an online algorithm
developed in [3] and [4]. It runs over the input only once and takes time O(k^2 n)
to process each input edge. The second algorithm has a faster processing time
per input edge of O(k + α(n)) but needs to read the input stream k + 1 times.
It is based on results in [4].

Both algorithms utilize the idea of a certificate. For a graph G a certificate
for k-connectivity is a subgraph of G such that the certificate is k-connected if
and only if G is k-connected. While reading the input the presented algorithms
both construct a certificate that does not exceed the memory limitations of the
semi-streaming model. Thus the algorithms can memorize their certificates and
can make use of them to determine k-connectivity of the input graph in a
postprocessing step without any further input.

Moreover the certificate of the second algorithm can be used not only to test
for k-connectivity but even for computing all ℓ-separators of a given graph for
every ℓ < k.

2 Preliminaries and Definitions 

We denote by G a graph G(V, E) with vertex set V and edge set E. Let n = |V|
and m = |E| be the number of vertices and the number of edges respectively.
Throughout the whole paper G is an undirected, unweighted graph without 
multiple edges or loops. 

We call two vertices connected if there is a path between them. A graph G
is connected if every pair of vertices in G is connected. A connected component of
G denotes an induced subgraph C of G such that C is connected and maximal.

Given a positive integer k, a graph G with at least k + 1 vertices is said to
be k-vertex connected or k-connected if the removal of any k − 1 vertices leaves
the graph connected.

We call a subset S of the vertices of G an ℓ-separator if ℓ = |S| and the
graph obtained by removing S and all edges incident to S from G has more
connected components than G.

For two distinct vertices x, y in G we call two paths between x and y
vertex-disjoint if they are internally vertex-disjoint, that is, have only the endpoints x
and y in common. Using that we denote by κ(x, y) the local connectivity between x
and y, being the maximum number of vertex-disjoint paths between x and y in
G.

For any property P and graph G we define a subgraph G' = (V, E'), E' ⊆ E,
to be a certificate for G in the case that G has property P if and only if G' has
property P. Thus a certificate for k-connectivity of G is a subgraph G' of G
such that G is k-connected if and only if G' is k-connected. A certificate G'
is said to be a sparse certificate if G' has a linear number of edges, that is,
|E'| = O(n).

A graph stream of a graph G is a sequence of the m edges of G in
arbitrary order. There is no restriction on the order of the edges, for example it is
not required that all edges incident with a vertex are grouped together in the
sequence. If we consider a graph stream as an input we mean that the edges
are revealed one at a time. A semi-streaming graph algorithm computes over a
graph stream as an input and is allowed to use a space of at most O(n · polylog n)
bits. The algorithm may access the input stream for P passes in a sequential
one-way order and use time T to process each single edge.

At some places in the paper we use the function α(n) which is the inverse
of the extremely quickly growing Ackermann function. Therefore α(n) is very
slowly growing [18].

3 Testing k-Connectivity in the Semi-Streaming
Model



In this section we show how to test a graph G for k-connectivity
in the semi-streaming model for k being an arbitrary positive constant. Our
approach uses the concept of sparse certificates. In the next two subsections
we develop two different semi-streaming algorithms A_1 and A_2, each of them
computing a sparse certificate for k-connectivity of a given graph G.

The basic idea for both algorithms is the same: While processing the graph
stream of G a sparse certificate is built up, C(A_1) by A_1 and C(A_2) by A_2. At
the end the certificate is used in a postprocessing step of the algorithm to decide
k-connectivity of G without any further input.

Since the certificates are to be memorized by the algorithms they have to
be sparse, i.e., consist only of a linear number of edges. There is a lower bound
of kn/2 on the number of edges in a certificate for a k-connected graph. That
follows from the fact that in such a graph each vertex has to be of degree at
least k. So a certificate G' for a k-vertex connected graph has minimum degree
δ(G') ≥ k and therefore consists of at least kn/2 edges.

It is easy to construct a certificate with a minimum number of edges for
1-connectivity, which is just a spanning tree. This can be done in the
semi-streaming model in one pass and per-edge processing time of O(α(n)) [7]. But
in general we cannot aim for certificates with a minimum number of edges since this
problem is NP-complete [10]. Even for k = 2 computing a minimum certificate
for k-connectivity of a graph G is NP-complete since it tells us if G contains a
Hamiltonian circuit.

For our purposes it suffices to generate certificates not of minimum but of
linear size. There are several approaches to this end. In [16] the authors present
a linear time algorithm that generates a certificate for k-connectivity of linear
size. We do not know if this algorithm can be adapted to the semi-streaming
model. In [3] it is shown that the sequential execution of k breadth-first searches
and a union of their trees yields a sparse certificate for k-connectivity. This
approach is not suitable for the semi-streaming model since by a result of [8]
computing a breadth-first search tree in the semi-streaming model cannot be
done in a constant number of passes over the input.

In this paper we use different approaches to construct a sparse certificate
for k-connectivity. Our first algorithm A_1 is an adaptation of an online algorithm
presented in [3] and [4]. The certificate G' is built up by adding an input edge
uv to G' if u and v are not k-connected in G'. The resulting certificate consists
of at most 2kn edges [4].

Our second algorithm A_2 uses the fact that the union of the forests of k
sequentially executed scan-first searches yields a certificate for k-connectivity
[4]. The number of edges in this certificate as a union of forests is at most
k(n − 1). We utilize this approach for the semi-streaming model by presenting
the semi-streaming algorithm A_2 that computes a union of k scan-first search
forests.

The two algorithms A_1 and A_2 presented in the next two subsections differ
in various ways: They have different per-edge processing time T, run a different
number of passes P over the input stream and produce different certificates.
While for the first algorithm T(A_1) = O(k^2 n) and it uses only one pass over the
input, i.e., P(A_1) = 1, for the second algorithm we have T(A_2) = O(k + α(n))
and P(A_2) = k + 1. Furthermore the certificate C(A_2) produced by the second
algorithm is more powerful than C(A_1). It can not only be used to test the input
graph G for k-connectivity as C(A_1) can, but also allows generating all ℓ-separators
of G where ℓ < k.

3.1 Slow Edge-Processing, One Pass 

In this subsection we present a semi-streaming algorithm that in one pass goes
over the graph stream of an input graph G. It uses O(k^2 n) time to process each
input edge and creates a certificate C(A_1) for k-vertex connectivity of G.

To derive this algorithm A_1 we adapt the online algorithm of Cheriyan and
Thurimella [3] and Cheriyan, Kao and Thurimella [4] respectively. Their
algorithm runs over a graph stream of the input graph G in one pass and constructs
a certificate for k-connectivity of G. We follow their approach but since they
consider no memory restrictions we have to make sure that using this approach
does not exceed the memory limitations of the semi-streaming model.

The algorithm itself is simple. At the beginning the certificate C(A_1) is
empty. For each input edge uv in the input stream we test whether the current
certificate C(A_1) as a subgraph of G contains at most k − 1 vertex-disjoint paths
between u and v. If so, we add the edge uv to C(A_1), otherwise the certificate
remains unchanged and A_1 forgets about this edge and examines the next one.
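
The edge rule just described can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the certificate is kept as a dictionary of neighbor sets, and the disjoint-path test uses plain BFS augmenting paths on a vertex-split unit-capacity network rather than the algorithm of Even and Tarjan used in the analysis below.

```python
from collections import defaultdict, deque

def count_disjoint_paths(adj, u, v, limit):
    """Count internally vertex-disjoint u-v paths in the undirected graph
    adj (dict: vertex -> set of neighbours), stopping early at `limit`."""
    cap = defaultdict(int)      # residual capacities of the split network
    nbrs = defaultdict(set)     # residual adjacency (arcs and reverse arcs)

    def arc(a, b):
        cap[(a, b)] += 1
        nbrs[a].add(b)
        nbrs[b].add(a)

    for x in adj:
        if x not in (u, v):
            arc((x, "in"), (x, "out"))   # unit capacity on inner vertices
        for y in adj[x]:
            arc((x, "out"), (y, "in"))   # edge xy, direction x -> y

    source, sink = (u, "out"), (v, "in")
    flow = 0
    while flow < limit:
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            a = queue.popleft()
            for b in nbrs[a]:
                if b not in parent and cap[(a, b)] > 0:
                    parent[b] = a
                    queue.append(b)
        if sink not in parent:
            break                        # no further augmenting path
        b = sink
        while parent[b] is not None:     # push one unit along the path
            a = parent[b]
            cap[(a, b)] -= 1
            cap[(b, a)] += 1
            b = a
        flow += 1
    return flow

def build_certificate_a1(edge_stream, k):
    """One pass over the stream: keep uv only if the current certificate
    has at most k - 1 vertex-disjoint u-v paths."""
    cert = defaultdict(set)
    for u, v in edge_stream:
        if count_disjoint_paths(cert, u, v, k) < k:
            cert[u].add(v)
            cert[v].add(u)
    return cert
```

Each call rebuilds the flow network from the current certificate, so the sketch mirrors the per-edge cost structure of the rule but makes no attempt to match the time bound claimed for A_1.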

Since our algorithm processes the edges in the way the algorithm in [4] does,
we can assert that A_1 indeed constructs a certificate for k-connectivity of the
input graph.

Lemma 1 ([4]) If the input graph G = (V, E) is k-connected, then the final
certificate C(A_1) is k-connected. □

We claim that we can keep C(A_1) in the memory of A_1 to allow testing of
the k-connectivity of G using C(A_1) in a postprocessing step. For that reason
we have to ensure that C(A_1) does not exceed the memory limitations of the
semi-streaming model of O(n · polylog n) bits. We follow the track of [4], where it
is shown that, using a result of Mader [2], the number of edges in the certificate
is linear.

Lemma 2 ([4]) The final certificate C(A_1) has at most 2kn edges. □

We can now formulate the final theorem for A_1 completing this subsection.

Theorem 3 Given a graph stream of a graph G as an input, the algorithm A_1
constructs a certificate C(A_1) for k-connectivity of G. While doing this, A_1 uses
time O(k^2 n) per edge. After reading all edges, A_1 is able to decide k-connectivity
of G in a postprocessing step. The space used in total is O(n · polylog n) bits.

Proof. Since C(A_1) is a subgraph of G it is immediate that if C(A_1) is
k-connected, G must be k-connected as well. Together with Lemma 1 it follows
that the final C(A_1) is a certificate for k-connectivity of G.

For each input edge uv, A_1 has to check whether there are at most k − 1
vertex-disjoint paths between u and v in C(A_1). This can be done by constructing a
flow from u to v with all node capacities set to one. To compute the flow we use
the algorithm of Even and Tarjan [6]. It runs within the memory limitations of
O(n · polylog n) bits since the space used by this algorithm is proportional to
the number of edges in the current certificate which has linear size by Lemma
2.

The flow between u and v has to be computed only up to a value of k and the
algorithm of Even and Tarjan can do so in time O(min{k, √n} m) on a graph
with n vertices and m edges. With m = O(kn) in our certificate and k being a
constant a time of O(k^2 n) suffices to test for k vertex-disjoint paths between u
and v for every input edge uv.

The final C(A_1) fits in O(n · polylog n) bits by Lemma 2. Since it is a
certificate for k-connectivity of G, A_1 can test the final C(A_1) for k-connectivity
to decide k-connectivity of G. This test can be done in a postprocessing step
without any further input. To this end we use the k-connectivity algorithm
of Gabow [9] that runs in time O(n + min{k^{5/2}, k n^{3/4}} k n) and, which is more
important, uses a space linear in the size of C(A_1). Consequently the
postprocessing step does not exceed the memory limitations of O(n · polylog n) bits
either and A_1 is indeed a semi-streaming algorithm. □



3.2 Fast Edge-Processing, k + 1 Passes 

In this subsection the semi-streaming algorithm A_2 is presented, which runs
over the graph stream input k + 1 times and examines each edge in time
O(k + α(n)). It constructs a certificate C(A_2) which can finally be used in a
postprocessing step to test the input graph G for k-connectivity.

Our algorithm A_2 uses the idea of scan-first search due to Cheriyan, Kao and
Thurimella [4]. Scan-first search is a way of systematically marking the vertices
of a given graph and works as follows. For a connected graph G we begin with
one arbitrary starting vertex marked and all other vertices unmarked. On a
marked vertex v we can do the main step called a scan of v, that is, marking all
unmarked neighbors of v. After that step v is called scanned and there will
be no scanning of v again. In that fashion scan-first search iteratively marks all
unmarked vertices and scans all unscanned but marked vertices of G until all
vertices are scanned.

Scan-first search in a connected graph G yields a spanning tree T in the
following way. At the beginning T is empty. If we scan a vertex v we add to
T all edges from v to its unmarked neighbors. Such a tree is called a scan-first
search tree.

We can get a consecutive numbering of the vertices by taking the order in 
which the vertices are scanned. We call such a numbering a scan-first search 
numbering. 

To apply scan-first search to a disconnected graph G we can successively
perform scan-first search on every connected component of G. That produces a
scan-first search tree for every connected component; the union of them we call
a scan-first search forest.

Note that scan-first search is a generalization of both depth-first search and
breadth-first search. If there is more than one vertex marked and unscanned
and therefore more than one vertex can be chosen to be scanned next, scan-first
search can take any of these vertices. By choosing in every step the vertex
that was marked most recently scan-first search performs a depth-first search.
If the vertex that has been marked for the longest time is chosen, scan-first
search proceeds in a breadth-first search manner.
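
A minimal sketch of scan-first search with the two selection policies mentioned above is given next. The function name, the adjacency-dictionary input and the `pick_newest` flag are assumptions made for illustration; they do not appear in [4].

```python
from collections import deque

def scan_first_search(adj, pick_newest=False):
    """Return (forest_edges, scan_order) of one scan-first search.

    adj: dict mapping each vertex to an iterable of its neighbours
         (undirected graph, symmetric adjacency).
    pick_newest=False scans the vertex marked for the longest time,
    pick_newest=True scans the most recently marked vertex.
    """
    marked, forest, order = set(), [], []
    for start in adj:                 # one tree per connected component
        if start in marked:
            continue
        marked.add(start)
        pending = deque([start])      # marked but not yet scanned
        while pending:
            v = pending.pop() if pick_newest else pending.popleft()
            order.append(v)           # scan v
            for w in adj[v]:
                if w not in marked:   # mark w and record the tree edge
                    marked.add(w)
                    forest.append((v, w))
                    pending.append(w)
    return forest, order
```

The returned scan order is a scan-first search numbering in the sense defined above, and any other rule for choosing among the pending vertices would also give a valid scan-first search.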

The next theorem shows how we will make use of scan-first search forests to
obtain a certificate for k-connectivity.

Theorem 4 ([4]) Given an undirected graph G(V, E) with n vertices and a
positive integer k. For i = 1, 2, ..., k let F_i be the edge set of the scan-first
search forest in the graph G_{i-1} = (V, E \ (F_1 ∪ ... ∪ F_{i-1})). Then
F_1 ∪ ... ∪ F_k is a certificate for the k-connectivity of G and this certificate
has at most k(n − 1) edges. □
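
As an offline illustration of the construction in Theorem 4 (with random access to the edges, so not yet a semi-streaming algorithm), the following sketch peels off k scan-first search forests one after another. The function names are hypothetical and the scan order used is the breadth-first variant.

```python
from collections import deque

def scan_first_forest(vertices, edges):
    """Edge list of one scan-first search forest of the graph (vertices, edges)."""
    adj = {v: [] for v in vertices}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    marked, forest = set(), []
    for s in vertices:
        if s in marked:
            continue
        marked.add(s)
        pending = deque([s])
        while pending:
            v = pending.popleft()          # scan v
            for w in adj[v]:
                if w not in marked:
                    marked.add(w)
                    forest.append((v, w))
                    pending.append(w)
    return forest

def sparse_certificate(vertices, edges, k):
    """Union F_1, ..., F_k where F_i is a scan-first search forest of the
    graph that remains after deleting F_1, ..., F_{i-1}."""
    remaining = {frozenset(e) for e in edges}
    certificate = []
    for _ in range(k):
        forest = scan_first_forest(vertices, [tuple(e) for e in remaining])
        certificate.extend(forest)
        remaining -= {frozenset(e) for e in forest}
    return certificate
```

For the complete graph on five vertices and k = 2 this yields 7 edges, within the bound k(n − 1) = 8.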



If we want to apply the scan-first search approach in our algorithm A_2,
we have to show how scan-first search can be performed in the semi-streaming
model.

Lemma 5 There is a semi-streaming algorithm X that generates a scan-first
search forest F of a graph G. To this aim X runs over the graph stream of G
twice. X processes each edge in time O(α(n)) in the first pass and in time O(1)
in the second pass over the input.

Proof. In the first pass over the input X computes a spanning forest Z of
G. This can be done using a disjoint set data structure D. We start with n
singletons, i.e., n sets containing a single vertex each, and Z = ∅. For every
input edge uv, X checks if u and v are in different sets of D. In that case uv is
added to Z and the sets of u and v are joined.

D can be maintained within the memory limitations of a semi-streaming
algorithm since for every vertex v we only have to memorize a vertex representing
the subset containing v. Augmented with path compression and union by rank
[18] the operations on D take time O(α(n)) for each input edge.

At the end of the first pass over the input X has constructed Z, which, as a
spanning forest of G, has at most n − 1 edges and can be stored in O(n · polylog n)
bits.

After reading the input the first time but before reading it again X performs
a depth-first search on Z as an intermediate step. The preorder numbering
0, ..., n − 1 of the vertices according to this depth-first search on Z yields an
order o : {0, ..., n − 1} → V of the vertices; let o(t) be the vertex at position
t in that order. While building the order o, X can at the same time construct
o^{-1}, that is, for every vertex v, o^{-1}(v) is the position of v in the order o.

The depth-first search can be done in time O(n), leaving the amortized time
for processing an edge in the first run over the graph stream unchanged. To
do the depth-first search and to store the order of the vertices and its inverse,
surely O(n · polylog n) bits suffice.

Note that this very order of the vertices can also be produced as a numbering
of a certain scan-first search run R of G. R starts at vertex s = o(0). In a
connected graph G the order o has the property that for every 0 < a ≤ n − 1 the vertex
u = o(a) is adjacent in G to a vertex v such that v = o(b), b < a. So R can scan
the vertices in order o: If R at step 0 < a ≤ n − 1 should scan vertex u = o(a),
there must be a vertex v which has been scanned before and is adjacent to
u. Therefore u is marked and can be scanned in step a. For a disconnected
graph G this argument can be extended to a sequential execution on the
connected components.

The aim of X in the second pass over the graph stream is to simulate the 
scan-first search run R to produce the scan-first search forest F that R would 
produce on G. Before starting the second pass over the input X knows about 
the sequence o in which R would scan the vertices of G and will, needless to 
say, make use of this order. 

At the beginning F is empty. For each input edge uv in the second pass X
looks at the positions o^{-1}(u), o^{-1}(v) of the vertices in the order o. Without loss
of generality, let o^{-1}(u) < o^{-1}(v). If there is no neighbor w of v in F such that
o^{-1}(w) < o^{-1}(v) we add the edge uv to F. If otherwise there is a neighbor x of
v in F with o^{-1}(x) < o^{-1}(v), X compares o^{-1}(x) and o^{-1}(u). In the case that
o^{-1}(u) > o^{-1}(x), X leaves F unchanged, forgets about uv and proceeds to the
next input edge. If on the contrary o^{-1}(u) < o^{-1}(x) the edge vx is deleted from
F and uv is inserted instead.

After processing the whole graph stream, for each uv ∈ F the following
holds: If o^{-1}(u) < o^{-1}(v), u is the only neighbor of v in F that is
preceding v in the order o. Moreover u is preceding all other
neighbors of v.

For this reason F is exactly the scan-first search forest that R would produce 
on G scanning the vertices in sequence o. An edge uv, u preceding v, is only 
put in the scan-first search forest of R if, during the scan of u, the unmarked 
neighbor v of u is marked. For all other neighbors x of v succeeding u but 
preceding v the edge xv is not added to the scan-first search forest since at the 
time x is scanned, v is already marked. So o^{-1}(u) < o^{-1}(x) for every other
neighbor x of v.

For the second pass over the graph stream X can perform all necessary
operations in time O(1) per edge. Since F as a forest consists of at most n − 1
edges, X can maintain F in O(n · polylog n) bits. □
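
The two passes of X from the proof above can be sketched as follows. This is an illustration under assumptions, not the authors' code: the function name is hypothetical, the stream is modelled as a callable that returns a fresh edge iterator for each pass, and F is stored implicitly as one predecessor per vertex.

```python
def scan_first_forest_two_pass(vertices, stream):
    """Two-pass sketch of algorithm X.

    stream: zero-argument callable returning a fresh iterator over the
            edges (u, v); each call models one pass over the input.
    Returns (forest, pos) where pos[v] plays the role of o^{-1}(v).
    """
    # Pass 1: spanning forest Z via union-find (rank + path compression).
    parent = {v: v for v in vertices}
    rank = {v: 0 for v in vertices}

    def find(x):
        root = x
        while parent[root] != root:
            root = parent[root]
        while parent[x] != root:            # path compression
            parent[x], x = root, parent[x]
        return root

    z_adj = {v: [] for v in vertices}
    for u, v in stream():
        ru, rv = find(u), find(v)
        if ru != rv:
            if rank[ru] < rank[rv]:
                ru, rv = rv, ru
            parent[rv] = ru                 # union by rank
            if rank[ru] == rank[rv]:
                rank[ru] += 1
            z_adj[u].append(v)
            z_adj[v].append(u)

    # Intermediate step: depth-first preorder numbering on Z.
    pos = {}
    for s in vertices:
        stack = [s]
        while stack:
            x = stack.pop()
            if x in pos:
                continue
            pos[x] = len(pos)
            stack.extend(reversed(z_adj[x]))

    # Pass 2: keep, for every vertex, its earliest preceding neighbour.
    pred = {}
    for u, v in stream():
        if pos[u] > pos[v]:
            u, v = v, u
        if v not in pred or pos[u] < pos[pred[v]]:
            pred[v] = u                     # insert uv, dropping any old edge
    return {(u, v) for v, u in pred.items()}, pos
```

Replacing `pred[v]` corresponds to deleting the edge vx and inserting uv in the proof; with dictionary lookups the second pass does a constant number of operations per edge.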



Now we can state our main theorem in this subsection on how A_2 computes
its C(A_2) as a union of the edges of k scan-first search forests.

Theorem 6 Given an undirected graph G(V, E) with n vertices and a positive
integer k. For i = 1, 2, ..., k let F_i be the edge set of the scan-first search forest
in the graph G_{i-1} = (V, E \ (F_1 ∪ ... ∪ F_{i-1})). A semi-streaming algorithm A_2
can compute C(A_2) := F_1 ∪ ... ∪ F_k using k + 1 passes over the graph stream of
G as input; F_i is computed in pass i + 1. A_2 processes each input edge in time
O(k + α(n)). The final C(A_2) can be used to decide k-connectivity of G in a
postprocessing step.

Proof. A_2 uses k nested instances of the semi-streaming algorithm X of Lemma
5. Let X_i, 1 ≤ i ≤ k, be the ith instance of X called by A_2. X_i is called at the
beginning of pass i and lasts for two passes of A_2, i.e., is finished after pass i + 1
of A_2. That way in pass i of A_2, X_i is running its first pass while X_{i-1} performs
its second pass.

We will show that every X_i computes F_i, a scan-first search forest on G_{i-1}.
It suffices to show that each X_i works not on the entire graph G but on
G_{i-1} with the reduced edge set E \ (F_1 ∪ ... ∪ F_{i-1}), since we know due to
Lemma 5 that each X_i constructs a valid scan-first search forest for the graph
that is presented to it.

In the first pass of each X_i, i > 1, X_i does not see the input of A_2, that is,
does not see the graph stream input edges. It gets edges handed over by X_{i-1}
as input edges. (This handing over of edges from X_{i-1} is described in the next
paragraph since it corresponds to the handing over from X_i to X_{i+1}.) On this
input handed over by X_{i-1}, X_i computes an ordering o_i of the vertices that a
scan-first search run on these edges in G might generate as a scan-first search
order. The first instance X_1 gets the original graph stream edges as input edges.

In the second pass of X_i (which is pass i + 1 of A_2) X_i directly processes
the input edges of A_2. For each input edge uv in the input stream X_i checks
whether uv is an edge in some F_j, j < i, where F_j is the forest that was computed by X_j.
This can be done in time O(k) as follows.

For each X_j, A_2 stores the computed forest F_j, the order of the vertices o_j^{-1}
used by X_j to build F_j and for each vertex v the position o_j^{-1}(w) of the at most
one neighbor w of v in F_j that is preceding v in o_j. (See Lemma 5 why this is
at most one neighbor.) To test if the input edge uv is in F_j, X_i looks at o_j^{-1}(u)
and o_j^{-1}(v). Let w.l.o.g. o_j^{-1}(u) < o_j^{-1}(v). Let p be the position of the at most
one neighbor of v in F_j that is preceding v in o_j. If p = o_j^{-1}(u) the edge is in
F_j, otherwise uv is not in F_j, since v has only one neighbor in F_j preceding v.
This way X_i can test for every input edge uv if it is part of one F_j, j < i, in
constant time. Since j < i ≤ k all F_j can be checked for uv in time O(k).

If X_i finds the input edge uv existing in one of the F_j, j < i, it skips the
edge and proceeds to the next input edge. In this case no edge is handed over
to X_{i+1}, which executes its first pass. If otherwise uv ∉ F_j for all j < i, X_i uses uv to
build its scan-first search forest F_i as specified in Lemma 5. If uv is not inserted
in F_i, it is handed over to X_{i+1}. If uv is added to F_i and xy removed from F_i
in exchange, xy is handed over to X_{i+1}. In the remaining case that uv is added
to F_i and no edge is removed from F_i no edge is handed over to X_{i+1}.

Using the described operations X_i computes the scan-first search forest of
G_{i-1} = (V, E \ (F_1 ∪ ... ∪ F_{i-1})). In the first pass of each X_i it does not see the
edges in F_1 ∪ ... ∪ F_{i-2} since they are skipped by X_{i-1} and not handed over to
X_i. Moreover X_{i-1} only hands over to X_i those edges that are not in F_{i-1}. So
in the first pass of each X_i it builds an order o_i of the vertices that a scan-first
search run might generate on G_{i-1}. In the second pass every X_i only uses the
input edges uv ∉ F_1 ∪ ... ∪ F_{i-1} to build its F_i according to o_i. That yields a
scan-first search forest due to Lemma 5.

It remains to show the claimed time bounds and the memory limitations of
A_2. In the first pass of A_2 only X_1 is running using a processing time of O(α(n))
per edge. In pass i, 1 < i < k + 1, X_i is running in its first pass getting at most
one edge handed over from X_{i-1} per input edge of A_2 and processing each of
these edges in O(α(n)). In the same pass i, X_{i-1} is executing its second pass.
While doing so X_{i-1} tests each input edge of A_2 for existence in F_1 ∪ ... ∪ F_{i-2}
in time O(k) and builds its F_{i-1} in O(1) per edge. In the last pass of A_2 only
X_k is running, checking the input edges for existence in the previous forests and
building its scan-first search forest according to o_k in time O(k) per input edge.

Since each X_i uses O(n · polylog n) bits a constant number of k instances
cannot overrun the memory boundaries of the semi-streaming model. Storing
the - at most one - neighbor of a vertex preceding this vertex for each F_i does
not violate these boundaries either.

After k + 1 passes each X_i has computed its F_i and A_2 can merge these F_i
producing C(A_2) as a certificate for k-connectivity of G. In a postprocessing
step A_2 can use C(A_2) to test G for k-connectivity executing on C(A_2) the
k-connectivity algorithm of Gabow [9] that runs in time O(n + min{k^{5/2}, k n^{3/4}} k n)
and uses linear space in the size of C(A_2). □
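
The O(k) membership test used in the proof above can be sketched like this. The data layout is an assumption for illustration: `positions[j][x]` stands for o_j^{-1}(x) and `pred_positions[j]` maps a vertex to the position of its single preceding neighbor in F_j, if it has one.

```python
def in_earlier_forest(u, v, positions, pred_positions):
    """Return True if edge uv belongs to one of the stored forests F_j.

    positions:      list of dicts, positions[j][x] = position of x in o_j
    pred_positions: list of dicts, pred_positions[j][x] = position of the
                    unique neighbour of x in F_j that precedes x in o_j
    One constant-time check per stored forest.
    """
    for pos, pred_pos in zip(positions, pred_positions):
        # Name the endpoints so that a precedes b in the order o_j.
        a, b = (u, v) if pos[u] < pos[v] else (v, u)
        if pred_pos.get(b) == pos[a]:
            return True
    return False
```

With at most k − 1 earlier forests stored, the loop runs at most k − 1 times per input edge.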

The certificate C(A_2) produced by A_2 is more powerful than the one of A_1.
Cheriyan, Kao and Thurimella [4] strengthened Theorem 4 in the following way.

Theorem 7 ([4]) Given an undirected graph G(V, E) with n vertices and a
positive integer k. For i = 1, 2, ..., k let F_i be the edge set of the scan-first search
forest in the graph G_{i-1} = (V, E \ (F_1 ∪ ... ∪ F_{i-1})). Then G_k = (V, F_1 ∪ ... ∪ F_k)
and G have the same ℓ-separators for all ℓ < k. □

So A_2 can use its computed certificate C(A_2) in a postprocessing step to
identify all ℓ-separators for any ℓ < k of the given graph G: For every pair of
vertices u, v, A_2 runs the algorithm of Even and Tarjan [6] to determine if there
are at most ℓ < k vertex-disjoint paths between u and v. If so, any set consisting
of one internal vertex of each of these ℓ paths is an ℓ-separator in C(A_2) and
thus in G by Theorem 7. The space used by the algorithm of [6] is linear in
the number of edges in C(A_2) and every ℓ-separator with ℓ < k is of constant
size. Therefore all ℓ-separators, ℓ < k, of a given graph can be computed and
memorized in a postprocessing step of A_2 without exceeding the boundaries of
the semi-streaming model.

4 Conclusions and Open Questions 

We extended the possibility of testing graph k-connectivity in the semi-streaming
model from k ≤ 4 to k being an arbitrary constant. To this aim we presented
two semi-streaming algorithms, both of them computing a sparse certificate for
k-connectivity of the input graph G. In a postprocessing step each algorithm
can use its constructed certificate to decide k-connectivity of G without
exceeding the limits of the semi-streaming model. The second algorithm can use its
certificate to generate all ℓ-separators for all ℓ < k.

Due to the memory limitations of the semi-streaming model our approaches
cannot be applied for non-constant k. We do not know if k-connectivity for
non-constant k can be determined in the semi-streaming model and what
approaches might be suitable.

For some real-world applications it is not feasible to allow more than one 
pass over the input stream. Therefore it is desirable to combine the fast per-edge 
processing time of our second algorithm with the modesty of our first algorithm 
reading the input only once. 